- Path: engnews1.Eng.Sun.COM!taumet!clamage
- From: "Constantine Antonovich:" <const@Orbotech.Co.IL>
- Newsgroups: comp.std.c++
- Subject: operators new[]/delete[]
- Date: 26 Feb 1996 16:07:51 GMT
- Organization: Orbotech ltd.
- Approved: clamage@eng.sun.com (comp.std.c++)
- Message-ID: <313166CF.11C2@orbotech.co.il>
- NNTP-Posting-Host: taumet.eng.sun.com
- Mime-Version: 1.0
- Content-Type: text/plain; charset=us-ascii
- Content-Transfer-Encoding: 7bit
- X-Nntp-Posting-Host: orange.orbotech.co.il
- X-Mailer: Mozilla 2.0 (X11; I; SunOS 5.5 sun4c)
- Originator: clamage@taumet
-
- Several days ago I posted the following code to this newsgroup,
- asking whether the code is incorrect according to the current ANSI
- draft standard or whether my compiler has a bug:
-
- //----------------------------------------------------------
- #include <iostream.h>
- #include <assert.h>
- #include <new.h>
-
- class A {
- public:
-     A(void) { cout << "A constructed" << endl; }
-     ~A() { cout << "A destructed" << endl; }
- };
-
- class B {
- public:
-     B(void) { cout << "B constructed" << endl; }
-     ~B() { cout << "B destructed" << endl; }
- };
-
- A* foo_allocate(unsigned size)
- {
-     assert(sizeof(A)==sizeof(B));
-
-     B* bptr=new B[size];
-     A* aptr=(A*)bptr;  // A* aptr=reinterpret_cast<A*>(bptr); is more correct
-                        // but my compilers do not support that yet.
-     for (unsigned j=size; j>0;)     // this place corrected according
-         (bptr+(--j))->~B();         // to a remark by a person whose name,
-                                     // to my regret, I have lost.
-     for (unsigned i=0; i<size; ++i)
-         new(aptr+i) A;
-
-     return aptr;
- }
-
- int main(void)
- {
-     A* arr=foo_allocate(2);
-     delete [] arr;  // #here
-     return 0;
- }
- //----------------------------------------------------------
-
- The point I considered problematic is that one of my compilers
- treats the line marked "#here" as the destruction of an array of B
- objects (the B destructors are called).
-
- I have received several answers (I also sent this code
- personally to Steve Clamage, who kindly answered me), all concluding
- that, according to the standard, the program contains operations
- with undefined results and so cannot be considered correct.
- Nevertheless, after some reflection, I am going to insist on the
- following:
-
- -- the ANSI working papers (April 1995) leave the correctness of
- the above code open to interpretation;
- -- if so, the matter needs to be defined more precisely to
- eliminate differences of interpretation among compiler
- vendors;
- -- the code can be interpreted as having absolutely correct and
- well-defined behavior.
-
- In the following text, I am going to prove this point of view.
-
-
- 1. A little history of the code sample.
-
- The code sample above may look like a play of imagination with
- no practical weight. However, it was born from another sample that
- makes a little more sense.
- Some time ago I noticed recurring discussions about the need
- for a renew operation in C++. I never felt deprived by the absence
- of renew myself, but seeing the arguments for its usefulness again
- and again, I started to think: what exactly is the problem, and is
- it really impossible to implement it by means of the language
- itself?
- So I wrote the following example, trying to avoid the redundant
- operations that usually occur when reallocation is done in the
- classic manner:
-
- //----------------------------------------------------------
- template<class T>
- class Allocator {
- private:
-     struct filler { char filler_[sizeof(T)]; };
- public:
-     static T* allocate_array(unsigned elm);
-     static void dup_array(T* dst, T* src, unsigned elm);
-     static void fill_array(T* dst, unsigned elm);
- };
-
- template<class T>
- T* Allocator<T>::allocate_array(unsigned elm)
- {
-     return (T*)new filler[elm];
- }
-
- template<class T>
- void Allocator<T>::dup_array(T* dst, T* src, unsigned elm)
- {
-     for (unsigned i=0; i<elm; ++i)
-         new(dst+i) T(src[i]);
- }
-
- template<class T>
- void Allocator<T>::fill_array(T* dst, unsigned elm)
- {
-     for (unsigned i=0; i<elm; ++i)
-         new(dst+i) T;
- }
-
- template<class T>
- T* realloc(T*& array, unsigned old_size, unsigned new_size)
- {
-     set_new_handler(0);
-     T* tmp=Allocator<T>::allocate_array(new_size);
-     if (tmp) {
-         if (new_size>old_size) {
-             Allocator<T>::dup_array(tmp,array,old_size);
-             Allocator<T>::fill_array(tmp+old_size,new_size-old_size);
-         }
-         else
-             Allocator<T>::dup_array(tmp,array,new_size);
-         delete [] array; array=tmp;
-     }
-     return tmp;
- }
- //----------------------------------------------------------
-
- The general idea was to allocate an array without initializing
- its objects and then to use the copy constructor to copy the old
- elements into the newly allocated array (avoiding the redundant
- creation of objects with their default constructor when they are
- immediately overwritten by a subsequent assignment). Using this
- technique causes no problems in container classes, where all
- memory and object management is completely hidden from the user,
- but the imaginary renew operation should be applicable to regular
- arrays, like:
-
- //----------------------------------------------------------
- A* ap=new A[4];
- realloc(ap,4,8);
- delete [] ap;
- //----------------------------------------------------------
-
- Obviously, the code that Allocator<T>::allocate_array would have
- to contain for this to work produces the problem mentioned
- above.
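
The copy-construct-into-raw-storage idea can also be sketched without going through delete[] at all. The following is only a minimal illustration, not the code discussed in this article: renew_raw, destroy_raw and Tracked are hypothetical names, the buffer must come from ::operator new (never from new T[n]), no attempt is made at exception safety, and over-aligned types are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: destroys `size` objects in reverse order and
// releases the raw storage they lived in.
template <class T>
void destroy_raw(T* array, std::size_t size)
{
    for (std::size_t i = size; i > 0;)
        array[--i].~T();
    ::operator delete(array);
}

// Hypothetical helper: returns a new buffer of `new_size` objects whose first
// elements are copies of the old ones; the old buffer is destroyed and freed.
template <class T>
T* renew_raw(T* old_array, std::size_t old_size, std::size_t new_size)
{
    assert(old_array != 0 || old_size == 0);

    // Allocation only: no T constructor runs here.
    T* fresh = static_cast<T*>(::operator new(new_size * sizeof(T)));

    std::size_t kept = old_size < new_size ? old_size : new_size;
    for (std::size_t i = 0; i < kept; ++i)
        new (fresh + i) T(old_array[i]);   // copy-construct surviving elements
    for (std::size_t i = kept; i < new_size; ++i)
        new (fresh + i) T();               // default-construct the tail only

    destroy_raw(old_array, old_size);
    return fresh;
}

// Small test type (an assumption of this sketch) that counts live objects.
struct Tracked {
    static int live;
    int value;
    Tracked() : value(0) { ++live; }
    Tracked(const Tracked& other) : value(other.value) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
```

A buffer managed this way has to be released with destroy_raw, which is exactly the inconvenience the renew discussion is about: the result cannot be handed to code that expects to call delete [] on it.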
-
- 2. Undefined behavior.
-
- Of course, some constructs in a program may produce undefined
- behavior. Nevertheless, even undefined behavior should have some
- definition. Let's consider the following code:
-
- //----------------------------------------------------------
- A* ap=new A;
- B* bp=reinterpret_cast<B*>(ap);
- delete bp; // #here
- //----------------------------------------------------------
-
- Without any doubt, the result of the code executed in the "#here"
- line is undefined. But I definitely would not like my compiler, as
- a result of this uncertainty, to send an email complaint to some
- league of "C++ compilers against the stupidity of programmers".
- Nor would I like my compiler to recognize the incorrectness of the
- code and silently call the A destructor (after all, bp points to
- an A object, doesn't it?) instead of the B one. What I mean is
- that
-
- UNDEFINED BEHAVIOR MEANS AN ERROR.
- UNDEFINED BEHAVIOR THAT ALWAYS RESULTS IN CORRECT EXECUTION OF
- THE CODE THAT HAS IT IS FORBIDDEN.
-
- In the previous example we are dealing with code that has
- undefined behavior. The code is obviously incorrect. Yet an
- imaginary compiler able to recognize the true type of an object
- could call the A destructor in the line marked "#here" (because
- the behavior would be undefined anyway). In that case the invalid
- code would execute correctly in every case (ALWAYS), and so the
- behavior of the compiler cannot be considered proper.
- C++, partly in its own right and partly as the heir of C,
- stands on the principle of not second-guessing the correctness of
- a programmer's actions as long as they are syntactically valid.
- So, in the example, in the line marked "#here", the imaginary
- compiler should honestly try to destroy a B object and free its
- memory. Applying the B destructor to an A object will most
- probably cause "undefined behavior", but the harm will depend on
- the particular A and B classes (and obviously it will not ALWAYS
- be harmless).
-
-
- 3. No kidding.
-
- C++ is not a language of wizardry. In general its behavior is
- understandable, reasonably clear and predictable. The creation of
- a C++ object consists of two parts: memory allocation and the
- construction of the object itself. Even if this separation into
- two independent parts is not obvious and is not stated explicitly
- by the ANSI draft, that changes nothing, because the separation
- follows from the language definition anyway.
- C++ memory management is ALMOST ALWAYS typeless. Here "ALMOST
- ALWAYS" stands for all cases covered by a standard-conforming
- language implementation, except for a countable number of cases
- where a programmer explicitly changes the language's behavior by
- means of language constructs (and, I should add, on his own
- responsibility). To illustrate the term, consider the following
- example:
-
- //----------------------------------------------------------
- A* ap=new A;
- //----------------------------------------------------------
-
- What does the code do? Obviously, it creates a new object of type
- A. Yes, but I should say that it creates a new object of type A
- ALMOST ALWAYS, because the definition of class A may be as follows:
-
- //----------------------------------------------------------
- class A {
-     // some stuff
- public:
-     // some stuff
-     void* operator new(size_t) { exit(1); return 0; }  // not for heap usage
- };
- //----------------------------------------------------------
-
- and in that case the previous statement obviously will not create
- any A object.
- C++ memory management is ALMOST ALWAYS typeless because the
- default operators new and new[] have no knowledge of the type of
- object they allocate memory for. On the other hand, these
- operators are the only C++ mechanism for managing memory.
- Moreover, this and only this part of object creation can be
- completely replaced by a programmer, and that clearly separates it
- into an independent stage of object creation.
- Object construction has hidden features (such as the
- initialization of virtual function tables) and can be influenced
- by a programmer only partly (through constructors and destructors).
- Meanwhile, the introduction of "placement new" in the ANSI draft
- completed the separation of object construction into an
- independent part, since an object can be created without
- allocating memory (at any place chosen by the programmer, not only
- on the stack) and can be legally destroyed without freeing the
- memory (by a direct call to its destructor).
- If we recall that, according to C++ principles, there should be
- no difference between objects and their behavior regardless of
- their placement, we have to agree that the allocation of memory
- for an object and the construction of the object in the allocated
- memory are two independent stages ALMOST ALWAYS.
- Taking all this into account, even the definitions of operators
- new and delete can be reconsidered to eliminate a number of
- duplicated definitions, for example:
-
- new T(<arg-list>);
- is shorthand for the sequence
- new(::operator new(sizeof(T))) T(<arg-list>);
- or
- new(T::operator new(sizeof(T))) T(<arg-list>);
- if T::operator new is defined.
-
- delete tp; // where tp is a non-null pointer to an object of type T
- is shorthand for the atomic sequence
- if (tp) { tp->~T(); ::operator delete(<cast-to-mostly-base>tp); }
- or
- if (tp) { tp->~T(); T::operator delete(<cast-to-mostly-base>tp); }
- if T::operator delete is defined.
-
- Actually, a similar redefinition can be done for new[]/delete[]
- as well.
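
The single-object shorthand can be spelled out in compilable form. This is a minimal sketch for illustration only: Probe and two_stage_demo are names invented here, and the counters exist just to show which of the four steps runs a constructor or destructor.

```cpp
#include <cassert>
#include <new>

// Hypothetical class that records how often it is constructed and destroyed.
struct Probe {
    static int constructed;
    static int destroyed;
    Probe()  { ++constructed; }
    ~Probe() { ++destroyed; }
};
int Probe::constructed = 0;
int Probe::destroyed = 0;

// Long-hand equivalent of `Probe* p = new Probe; delete p;`.
// Returns the number of Probe objects still alive afterwards.
int two_stage_demo()
{
    const int c0 = Probe::constructed;
    const int d0 = Probe::destroyed;

    void* raw = ::operator new(sizeof(Probe));  // allocation: no constructor runs
    assert(Probe::constructed == c0);

    Probe* p = new (raw) Probe;                 // construction in existing storage
    assert(Probe::constructed == c0 + 1);

    p->~Probe();                                // destruction: storage is still ours
    assert(Probe::destroyed == d0 + 1);

    ::operator delete(raw);                     // deallocation
    return Probe::constructed - Probe::destroyed;
}
```

Between the destructor call and ::operator delete the storage holds no object but is still allocated, which is the state the array example relies on when it constructs A objects where B objects used to be.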
-
- 4. Alignment and memory allocation.
-
- In the example that opens this article, the following code
-
- //----------------------------------------------------------
- assert(sizeof(A)==sizeof(B));
-
- B* bptr=new B[size];
- A* aptr=(A*)bptr;
- //----------------------------------------------------------
-
- really seems dangerous.
-
- Fergus Henderson writes:
- "This assertion is not guaranteed to succeed.
- It would take an extremely perverse implementation
- for it to fail, however, so I think it would be very
- portable, even though it is not strictly guaranteed
- to work."
-
- This statement seems reasonable but, in fact, the assertion
- guarantees correctness ALMOST ALWAYS, and in that sense the check
- is absolutely portable.
-
- ANSI draft says:
- 18.4.1.1 Single-object forms
- Effects:
- The allocation function called by a new-expression to
- allocate size bytes of storage suitably aligned to
- represent any object of that size.
-
- 18.4.1.2 Array forms
- Effects:
- The allocation function called by the array form of
- a new-expression to allocate size bytes of storage
- suitably aligned to represent any array object of that
- size or smaller.
-
- We see that the ANSI draft says that the allocated memory should
- be suitably aligned for any object and any array object, with size
- as the only limitation. We can also recall C++ pointer arithmetic
- and what the sizeof of a particular object is (it incorporates the
- alignment of the object itself).
- An implementation does not have to be extremely perverse for
- the condition behind the assertion to fail. It can be a very
- simple one in which class B, for example, has its own operator
- new[] that allocates memory with a specific alignment suitable for
- B but not for any other class, A in particular (and even that is
- impossible with compilers that do not yet support overloading of
- operator new[]).
- But this situation is exactly the "ALMOST ALWAYS" case. If I
- take the responsibility of overloading operator new[] for a
- specific class, it is also my responsibility to take care when
- using operations like the one I am performing under the condition
- of equal object sizes.
-
- The language should stand on "ALMOST ALWAYS"
- correctness (and de facto it does). If a
- programmer does something that is ALMOST
- ALWAYS correct, the language should behave
- as if it were ALWAYS correct. Since an ALMOST
- ALWAYS correct action may become incorrect only as
- a result of the programmer's own activity, it is
- also the programmer's responsibility to take care
- when using such actions.
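
With a later compiler the size check can be moved to compile time and extended to alignment. The sketch below is an assumption-laden illustration: static_assert and alignof are C++11 features that the compilers discussed here do not have, can_reuse_storage is a hypothetical name, and the check is deliberately conservative (it says nothing about whether delete [] on the reused storage is well defined).

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the A and B of the opening example (bodies omitted).
struct A { ~A() {} };
struct B { ~B() {} };

// Hypothetical trait: storage laid out for an array of From can hold an
// array of To element by element if the sizes match and To is not more
// strictly aligned than From.
template <class From, class To>
struct can_reuse_storage {
    static const bool value =
        sizeof(From) == sizeof(To) && alignof(To) <= alignof(From);
};

// Compile-time replacement for assert(sizeof(A)==sizeof(B)).
static_assert(can_reuse_storage<B, A>::value,
              "A objects do not fit the storage of a B array");
```

A class-specific operator new[] that returns storage aligned only for B would be caught by the alignment half of the check whenever A is more strictly aligned than B.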
-
-
- Fergus Henderson continues:
- //----------------------------------------------------------
- B* bptr=new B[size];
- A* aptr=(A*)bptr;
- //----------------------------------------------------------
- "This cast has unspecified behavior. (See 5.2.9
- [expr.cast.reinterpret]/8.). However, I would
- expect it to work on most implementations."
-
- This clause of the ANSI draft makes the operation unspecified in
- the case of a cast from T1 to T2 and back when the alignment
- requirements of T1 and T2 differ. Obviously, that is not our case
- (if only because the allocation function is defined to return
- storage suitably aligned for any object).
-
-
-
- 5. Rest in peace.
-
- The following piece of code was considered clear by all the
- experts:
-
- //----------------------------------------------------------
- for (unsigned j=size; j>0;)
- (bptr+(--j))->~B();
-
- for (unsigned i=0; i<size; ++i)
- new(aptr+i) A;
- //----------------------------------------------------------
-
-
- 6. Undefined behavior (continued).
-
- All the experts considered the deletion of an array allocated in
- such a strange manner to be the most incorrect point, one with
- undefined behavior.
-
- Steve Clamage writes:
- "You do have an operation with undefined results,
- however. In effect you are doing this:
- A* p = (A*)new B[2];
- delete [] p;
- The rule is that the type of the pointer passed
- to delete[] must match the type of the pointer
- returned by new[], which is not the case here.
- The compiler is not required to diagnose the error,
- and the language definition does not say what the
- result is."
-
- Definitely, I am not doing that. If the standard allows my code
- to be interpreted in such a manner, then there is something wrong
- with the standard! But even if the behavior is declared to be
- undefined, I would like to recall what I said in section 2. If the
- uncertainty of the behavior were properly defined, the code would
- not have undefined behavior! (It would be very interesting to test
- the code on some other compilers. I suppose that the code has, in
- fact, sufficiently well-defined behavior de facto, as a result of
- the most logical implementation of operators new/delete, and that
- only SPARCompiler C++, for an unknown reason, stores a pointer to
- the destructor function together with the array size.)
-
- Fergus Henderson agrees with Steve Clamage:
- "This has undefined behaviour. It contravenes 5.3.5
- [expr.delete]/2, which says that the expression
- passed to `delete []' must be a pointer to the first
- element of an array of objects allocated with `new []';
- this is not the case, because although there once was
- such an array at that memory location, its lifetime
- ended when the memory was overwritten by the calls
- to placement new (see 3.8[basic.life]/1)."
-
- There is at least one self-contradictory point in that
- conclusion. Of course, the lifetime of all the B objects has
- ended, but why does that mean the end of the array's life? Or, on
- the contrary, if the end of the lifetime of the B objects means
- the end of the lifetime of the array, then the creation of the A
- objects should mean the creation of a new array, shouldn't it?
-
- Paragraph 5.3.5/2 of the ANSI draft says something slightly
- different:
- "...In the second alternative (delete array), the value
- of the operand of delete shall be a pointer to an
- array created by a new-expression without a new-placement
- specification. If not, the behavior is undefined."
-
- So delete takes as its argument a POINTER TO AN ARRAY (objects
- are not even mentioned). Nowhere does it say that
- ...pointer passed to delete[] must match the type
- of the pointer returned by new[]...
- ...the expression passed to `delete []' must be a
- pointer to the first element of an array of objects
- allocated with `new []'...
- All of that is already an INTERPRETATION of the standard, and it
- also shows that the standard permits such interpretations.
- Meanwhile, C++ memory management does not seem to need such strong
- restrictions, simply because memory can be managed separately from
- object construction/destruction and can be reused without
- reallocation.
- I agree that everything quoted above regarding operators
- new/delete is the common-sense point of view (one should use
- delete and delete[] with the same pointer, of the same type, that
- one got from new and new[], and should not play with the pointers
- in 99.9999% of the cases in which one uses new/delete at all, and
- in 100% of the cases if one does not understand what one is
- doing), but that has nothing in common with the boundaries of
- proper language processing.
- And here we really arrive at the final point. The ANSI draft
- gives no array definition strong enough to rule out ambiguous
- interpretations of an array. And the common-sense understanding of
- arrays described above has every right to exist.
-
- 7. No kidding (continued).
-
- I suppose that this ambiguity in the interpretation of arrays
- and operators new/delete should be eliminated from the standard. I
- would propose the following additions, on the assumption that
- they:
-
- -- do not conflict with any of the standard's existing
- definitions;
- -- do not change anything in the standard's general principles
- or in the common understanding of the standard, except for a
- very specific point with no influence upon the standard as
- a whole;
- -- have little effect on existing implementations of the
- language, since some implementations use this idea de
- facto and the others can easily be corrected;
- -- have little effect on existing C++ applications, because
- they concern a very specific point in the standard that is
- used extremely rarely (if at all);
- -- will make the standard more logically complete.
-
- Addition to the array definition [dcl.array]:
- An array of N objects of type T is a contiguous region
- of memory of suitable size and alignment, with N
- non-overlapping objects of type T placed in that
- memory with no gaps, each properly aligned.
-
- Addition to operator delete [expr.delete] ("above" here
- means everything previously said by the standard):
- In either alternative, the type of the deleted object
- is determined as described above and according to the
- type of the actual operand.
-
- In my opinion, the additions may seem superfluous, but
- experience shows they are not.
-
- 8. Renew
-
- In my opinion, the additions mentioned above would make the
- example that opens this article absolutely clear and ALMOST ALWAYS
- correct without discussion (I continue to insist that the example
- is so even now, but with discussion). But what about renew? The
- C++ standard accepts new keywords and new features with great
- difficulty. But maybe it makes sense, at least for completeness,
- to add to the standard's (fairly) new family of various_cast<T>
- operators something like the following:
-
- dynamic_sizeof(object); // should return sizeof evaluated at
- // run time, for example:
-
- //----------------------------------------------------------
- class A {
- public:
-     unsigned u;
-     A();
-     virtual ~A();
- };
-
- class B : public A {
- public:
-     int i;
-     B();
-     ~B();
- };
-
- int main(void)
- {
-     A* a=new B;
-     cout << sizeof(A) << endl;           // prints 8
-     cout << sizeof(B) << endl;           // prints 12
-     cout << sizeof(*a) << endl;          // prints 8
-     cout << dynamic_sizeof(*a) << endl;  // prints 12
-     return 0;
- }
- //----------------------------------------------------------
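
Part of what dynamic_sizeof would do can already be approximated with a virtual function, at the cost of having to override it in every derived class. The following is only a sketch of that workaround, not the proposed operator: Base, Derived, dynamic_size and check_dynamic_size are hypothetical names, and the actual sizes depend on the implementation, so none are asserted as numbers.

```cpp
#include <cassert>
#include <cstddef>

class Base {
public:
    unsigned u;
    Base() : u(0) {}
    virtual ~Base() {}
    // Each class reports its own size; sizeof is still evaluated at compile
    // time, but the function that returns it is selected at run time.
    virtual std::size_t dynamic_size() const { return sizeof(*this); }
};

class Derived : public Base {
public:
    int i;
    Derived() : i(0) {}
    virtual std::size_t dynamic_size() const { return sizeof(*this); }
};

void check_dynamic_size()
{
    Derived d;
    const Base& b = d;
    assert(sizeof(b) == sizeof(Base));            // static type decides
    assert(b.dynamic_size() == sizeof(Derived));  // dynamic type decides
}
```

The weakness of the workaround is visible at once: a derived class that forgets the override silently reports the size of its base, which a built-in dynamic_sizeof would not do.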
-
-
- array_sizeof(pointer); // should return the sizeof of an array, for example:
-
- //----------------------------------------------------------
- class A {
- public:
-     unsigned u;
-     A();
- };
-
- int main(void)
- {
-     A* a=new A;
-     A ar[2];
-     A* ap=new A[4];
-
-     cout << array_sizeof(a)/sizeof(A) << endl;   // prints 0
-     cout << array_sizeof(ar)/sizeof(A) << endl;  // prints 2
-     cout << array_sizeof(ap)/sizeof(A) << endl;  // prints 4
-     return 0;
- }
- //----------------------------------------------------------
-
- I suppose that the use of arrays of objects with no destructors
- (where a compiler may optimize away the storage of the number of
- elements) has become rare enough in contemporary C++ (and in a lot
- of cases the number of elements in such arrays is already known at
- compile time) that the language could quite easily provide a
- programmer with such information as the number of elements in an
- array.
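
For the case where the number of elements is known at compile time, the count can already be recovered by template argument deduction. This is a minimal sketch covering only true arrays such as ar above; array_count and check_array_count are hypothetical names, and nothing here recovers the count stored by new[].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the array bound N is deduced from the argument's type.
template <class T, std::size_t N>
std::size_t array_count(T (&)[N])
{
    return N;
}

void check_array_count()
{
    int ar[2] = {0, 0};
    assert(array_count(ar) == 2);

    // A pointer obtained from new[] does not carry its bound in its type,
    // so the following would not compile:
    //     int* ap = new int[4];
    //     array_count(ap);
}
```

The array_sizeof(ap) case is therefore the only one that really needs help from the implementation.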
-
- --
- //------------------------------------------------------------------
- // Opinions expressed here are my own only
- // Constantine Antonovich const@orbotech.co.il
- //------------------------------------------------------------------
- [ To submit articles: Try just posting with your newsreader.
- If that fails, use mailto:std-c++@ncar.ucar.edu
- FAQ: http://reality.sgi.com/employees/austern_mti/std-c++/faq.html
- Policy: http://reality.sgi.com/employees/austern_mti/std-c++/policy.html
- Comments? mailto:std-c++-request@ncar.ucar.edu
- ]
-